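This paper describes our participant system for the multimodal fact verification (Factify) challenge at AAAI 2022. Despite recent advances in text-based verification techniques and in large pre-trained multimodal models spanning vision and language, very limited work has applied multimodal techniques to automating the fact-checking process, particularly given the prevalence on social media of claims and fake news involving images and videos. In our work, the challenge is treated as a multimodal entailment task and framed as multi-class classification. Two baseline approaches are proposed and explored: an ensemble model (combining two uni-modal models) and a multimodal attention network (modelling the interaction between image and text pairs from the claim and the evidence document). We conduct several investigative and benchmarking experiments in this work. Our best model ranked first on the leaderboard, obtaining a weighted average F-measure of 0.77 on both the validation and test sets. An exploratory analysis of the dataset is also carried out and uncovers salient patterns and issues (e.g., word overlap, visual entailment correlation, source bias) that motivate our hypothesis. Finally, we highlight challenges of the task and of the multimodal dataset for future research.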
Translated by Google Translate
Modelling the temperature of Electric Vehicle (EV) batteries is a fundamental task of EV manufacturing. Extreme temperatures in the battery packs can affect their longevity and power output. Although theoretical models exist for describing heat transfer in battery packs, they are computationally expensive to simulate. Furthermore, it is difficult to acquire data measurements from within the battery cell. In this work, we propose a data-driven surrogate model (LiFe-net) that uses readily accessible driving diagnostics for battery temperature estimation to overcome these limitations. This model incorporates Neural Operators with a traditional numerical integration scheme to estimate the temperature evolution. Moreover, we propose two further variations of the baseline model: LiFe-net trained with a regulariser and LiFe-net trained with a time stability loss. We compared these models in terms of generalization error on test data. The results showed that LiFe-net trained with a time stability loss outperforms the other two models and can estimate the temperature evolution on unseen data with a relative error of 2.77% on average.
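The idea of pairing a learned model with a numerical integration scheme can be illustrated with a minimal sketch. The abstract does not state which integrator or inputs LiFe-net uses, so the explicit Euler step, the function names `rollout` and `toy_rate`, and every constant below are assumptions for illustration only; `toy_rate` is a hand-written placeholder, not a neural operator.

```python
def rollout(t0, inputs, dt, rate_fn):
    """Integrate dT/dt = rate_fn(T, u) with explicit Euler steps.

    t0      -- initial temperature
    inputs  -- one driving-diagnostic value per time step (hypothetical)
    dt      -- step size
    rate_fn -- model of the temperature rate of change; in a surrogate
               model this would be the trained network
    """
    temps = [t0]
    for u in inputs:
        # Euler update: next temperature = current + dt * predicted rate.
        temps.append(temps[-1] + dt * rate_fn(temps[-1], u))
    return temps


def toy_rate(temp, current):
    # Hypothetical stand-in for the learned model: heating grows with the
    # square of the current, cooling pulls toward a 25-degree ambient.
    return 0.01 * current ** 2 - 0.1 * (temp - 25.0)


temps = rollout(25.0, [10.0, 10.0, 0.0], dt=1.0, rate_fn=toy_rate)
```

In a setup like this, the integrator carries the time-stepping structure while the learned component only has to predict a rate of change from the current state and inputs.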
Steerable convolutional neural networks (CNNs) provide a general framework for building neural networks equivariant to translations and other transformations belonging to an origin-preserving group $G$, such as reflections and rotations. They rely on standard convolutions with $G$-steerable kernels obtained by analytically solving the group-specific equivariance constraint imposed onto the kernel space. As the solution is tailored to a particular group $G$, the implementation of a kernel basis does not generalize to other symmetry transformations, which complicates the development of group equivariant models. We propose using implicit neural representation via multi-layer perceptrons (MLPs) to parameterize $G$-steerable kernels. The resulting framework offers a simple and flexible way to implement Steerable CNNs and generalizes to any group $G$ for which a $G$-equivariant MLP can be built. We apply our method to point cloud (ModelNet-40) and molecular data (QM9) and demonstrate a significant improvement in performance compared to standard Steerable CNNs.
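For reference, the equivariance constraint mentioned above is usually written as follows. The symbols $\rho_{\text{in}}$ and $\rho_{\text{out}}$ (the representations of $G$ acting on the input and output features) and the kernel name $k$ are notation assumed here, not taken from the abstract:

$$
k(g \cdot x) = \rho_{\text{out}}(g)\, k(x)\, \rho_{\text{in}}(g)^{-1} \qquad \text{for all } g \in G,\ x \in \mathbb{R}^n .
$$

A kernel satisfying this condition is called $G$-steerable. An MLP that is $G$-equivariant under the corresponding actions on its input and output satisfies the constraint by construction, which is what makes an implicit parameterization possible without solving the constraint analytically for each group.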
The interaction of ultra-intense, ultra-short laser pulses with near- and overcritical plasmas cannot be directly observed, and experimentally accessible quantities (observables) often give only indirect information about the underlying plasma dynamics. Furthermore, the information provided by observables is incomplete, making the inverse problem highly ambiguous. Therefore, in order to infer plasma dynamics as well as experimental parameters, the full distribution over parameters given an observation needs to be considered, which requires models that are flexible and account for the information lost in the forward process. Invertible Neural Networks (INNs) have been designed to efficiently model both the forward and inverse process, providing the full conditional posterior given a specific measurement. In this work, we benchmark INNs and standard statistical methods on synthetic electron spectra. First, we provide experimental results with respect to the acceptance rate, where our results show increases in acceptance rates of up to a factor of 10. Additionally, we show that this increased acceptance rate translates into a speed-up of the same extent for INNs. Lastly, we propose a composite algorithm that utilizes INNs and promises low runtimes while preserving high accuracy.
Datacenter operators ensure fair and regular server maintenance by using automated processes to schedule maintenance jobs to complete within a strict time budget. Automating this scheduling problem is challenging because maintenance job duration varies based on both job type and hardware. While it is tempting to use prior machine learning techniques for predicting job duration, we find that the structure of the maintenance job scheduling problem creates a unique challenge. In particular, we show that prior machine learning methods that produce the lowest-error predictions do not produce the best scheduling outcomes due to asymmetric costs. Specifically, underpredicting maintenance job duration results in more servers being taken offline and longer server downtime than overpredicting it, so the system cost of underprediction is much larger than that of overprediction. We present Acela, a machine learning system for predicting maintenance job duration, which uses quantile regression to bias duration predictions toward overprediction. We integrate Acela into a maintenance job scheduler and evaluate it on datasets from large-scale, production datacenters. Compared to machine learning based predictors from prior work, Acela reduces the number of servers that are taken offline by 1.87-4.28X, and reduces the server offline time by 1.40-2.80X.
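How quantile regression biases predictions toward overprediction can be seen from its loss function, the quantile (pinball) loss. The sketch below is a generic illustration, not Acela's implementation: the abstract does not specify the model, features, or quantile level, so the durations, the quantile level 0.9, and the helper names `pinball_loss` and `best_constant` are all assumptions.

```python
def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Underprediction (diff > 0) costs q per unit of error;
        # overprediction (diff < 0) costs (1 - q) per unit of error.
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def best_constant(durations, q, candidates):
    """Return the constant prediction with the lowest pinball loss."""
    return min(
        candidates,
        key=lambda c: pinball_loss(durations, [c] * len(durations), q),
    )


durations = [10, 12, 15, 20, 40]   # hypothetical job durations (minutes)
candidates = list(range(0, 61))    # candidate constant predictions

median_like = best_constant(durations, 0.5, candidates)    # symmetric cost
high_quantile = best_constant(durations, 0.9, candidates)  # penalizes underprediction
```

With a symmetric cost (`q = 0.5`) the best constant prediction is the median of the durations, while a higher quantile level pushes the prediction upward, so that jobs are less likely to overrun their predicted duration.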
We present an approach for safe trajectory planning, where a strategic task related to autonomous racing is learned sample-efficiently within a simulation environment. A high-level policy, represented as a neural network, outputs a reward specification that is used within the cost function of a parametric nonlinear model predictive controller (NMPC). By including constraints and vehicle kinematics in the nonlinear program (NLP), we are able to guarantee trajectories that are safe and feasible with respect to the model used. Compared to classical reinforcement learning (RL), our approach restricts the exploration to safe trajectories, starts with a good prior performance and yields full trajectories that can be passed to a lowest-level tracking controller. We do not address the lowest-level controller in this work and assume perfect tracking of feasible trajectories. We show the superior performance of our algorithm on simulated racing tasks that include high-level decision making. The vehicle learns to efficiently overtake slower vehicles and to avoid getting overtaken by blocking faster vehicles.
We present a toolchain for solving path planning problems for concentric tube robots through obstacle fields. First, ellipsoidal sets representing the target area and obstacles are constructed from labelled point clouds. Then, the nonlinear and highly nonconvex optimal control problem is solved by introducing a homotopy on the obstacle positions where at one extreme of the parameter the obstacles are removed from the operating space, and at the other extreme they are located at their intended positions. We present a detailed example (with more than a thousand obstacles) from stereotactic neurosurgery with real-world data obtained from labelled MPRI scans.
The upcoming exascale era will provide a new generation of physics simulations. These simulations will have a high spatiotemporal resolution, which will impact the training of machine learning models, since storing a large amount of simulation data on disk is nearly impossible. Therefore, we need to rethink the training of machine learning models for simulations in the upcoming exascale era. This work presents an approach that trains a neural network concurrently with a running simulation without storing data on disk. The training pipeline accesses the training data by in-memory streaming. Furthermore, we apply methods from the domain of continual learning to enhance the generalization of the model. We tested our pipeline by training a 3D autoencoder concurrently with a laser wakefield acceleration particle-in-cell simulation. Furthermore, we experimented with various continual learning methods and their effect on generalization.
We propose to characterize and improve the performance of blind room impulse response (RIR) estimation systems in the context of a downstream application scenario, far-field automatic speech recognition (ASR). We first draw the connection between improved RIR estimation and improved ASR performance, as a means of evaluating neural RIR estimators. We then propose a GAN-based architecture that encodes RIR features from reverberant speech and constructs an RIR from the encoded features, and uses a novel energy decay relief loss to optimize for capturing energy-based properties of the input reverberant speech. We show that our model outperforms the state-of-the-art baselines on acoustic benchmarks (by 72% on the energy decay relief and 22% on an early-reflection energy metric), as well as in an ASR evaluation task (by 6.9% in word error rate).
Humans and animals excel in combining information from multiple sensory modalities, controlling their complex bodies, adapting to growth and failures, and using tools. These capabilities are also highly desirable in robots. They are displayed by machines to some extent, yet the artificial creatures are lagging behind. The key foundation is an internal representation of the body that the agent - human, animal, or robot - has developed. The mechanisms of operation of body models in the brain are largely unknown, and even less is known about how they are constructed from experience after birth. In collaboration with developmental psychologists, we conducted targeted experiments to understand how infants acquire their first "sensorimotor body knowledge". These experiments inform our work, in which we construct embodied computational models on humanoid robots that address the mechanisms behind learning, adaptation, and operation of multimodal body representations. At the same time, we assess which of the features of the "body in the brain" should be transferred to robots to give rise to more adaptive and resilient, self-calibrating machines. We extend traditional robot kinematic calibration, focusing on self-contained approaches where no external metrology is needed: self-contact and self-observation. A problem formulation that allows several ways of closing the kinematic chain to be combined simultaneously is presented, along with a calibration toolbox and experimental validation on several robot platforms. Finally, next to models of the body itself, we study peripersonal space - the space immediately surrounding the body. Again, embodied computational models are developed, and the possibility of turning these biologically inspired representations into safe human-robot collaboration is subsequently studied.